Lecture 5 - Hypothesis Tests
Argument, Data, and Politics: POLS 3312
2024-02-07
z-score
- normal distribution
- sample size above 30
- easiest formula
- easy to remember significance levels
Student’s t-Test
- similar to normal distribution
- small sample sizes
- different formula, a little harder to calculate by hand
- significance depends on sample size
- have to consult table
Chi-Square Test
- categorical variables
- different formula
- significance depends on degrees of freedom (d.f.)
- d.f. is a function of sample size and number of variables (categories)
- have to consult table
Different distributions mean different probabilities
Normal and t-distributions
Comparison of normal distribution to t-distribution with sample size 15
Chi-square distribution
- Chi-square distribution with different degrees of freedom
- Degrees of freedom is a function of sample size and number of variables (categories)
68-95-99.7 rule
Based on the standard normal distribution
Based on the 68-95-99.7 rule
Measures how many standard errors a value is from the mean
Can be used to test:
- hypotheses about the mean of a population
- hypotheses about the difference between two means (two groups with identical distributions)
- hypotheses about the difference between a mean and a value
- hypotheses about the difference between two proportions
Sample Size Matters
A lot!
Standard error: \(SE = \frac{s}{\sqrt{n}}\), where s is the sample standard deviation and n is the sample size
So what happens as sample size increases?
As sample size increases, the standard error decreases, holding everything else constant.
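A minimal sketch of that relationship, assuming a hypothetical sample standard deviation of 10 (not a value from the lecture). The function name `standard_error` is made up for illustration.

```python
import math


def standard_error(s, n):
    """Standard error of the mean: s divided by the square root of n."""
    return s / math.sqrt(n)


s = 10.0  # hypothetical sample standard deviation, held constant

# Quadrupling the sample size cuts the standard error in half.
for n in (25, 100, 400):
    print(f"n = {n:>3}  standard error = {standard_error(s, n):.2f}")
```

The sample size sits under a square root, so large gains in precision require much larger samples.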
Standard Error is the standard deviation of the sample means.
- If we do 1000 trials, each drawing random, independent, identically distributed (IID) variables from any distribution
- The means of each trial are the sample means
- Central Limit Theorem tells us that the distribution of the sample means will converge to a normal distribution
Number of standard errors from the mean
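The Central Limit Theorem bullets above can be tried out with a small simulation. This is only a sketch under stated assumptions: the exponential distribution, the 40 observations per trial, and the random seed are all arbitrary choices made for illustration, not values from the lecture.

```python
import random
import statistics

random.seed(3312)  # fixed seed so the sketch is repeatable

n = 40         # observations per trial (hypothetical choice)
trials = 1000  # number of trials

# Each trial draws n IID values from a skewed (exponential) distribution
# with mean 1 and standard deviation 1, then records that trial's mean.
sample_means = []
for _ in range(trials):
    draws = [random.expovariate(1.0) for _ in range(n)]
    sample_means.append(statistics.fmean(draws))

mean_of_means = statistics.fmean(sample_means)
sd_of_means = statistics.stdev(sample_means)  # empirical standard error
predicted_se = 1.0 / n ** 0.5                 # sigma / sqrt(n)

# Share of sample means within one predicted standard error of the true mean.
within_one_se = sum(abs(m - 1.0) <= predicted_se for m in sample_means) / trials

print(f"mean of sample means:          {mean_of_means:.3f}")
print(f"std. dev. of sample means:     {sd_of_means:.3f}")
print(f"predicted standard error:      {predicted_se:.3f}")
print(f"share within 1 standard error: {within_one_se:.3f}")
```

Even though the underlying distribution is strongly skewed, the spread of the sample means should land close to the predicted standard error, and roughly two thirds of them should fall within one standard error of the true mean.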
Probability that actual population parameter is approximately equal to sample statistic
If we know the sample mean, \(\bar{x}\), is 50
and the standard error, \(\sigma_{\bar{x}}\), is 1
We want to locate the population mean, \(\mu\)
68-95-99.7 Rule
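The rule can be checked against the standard normal distribution and applied to the example of a sample mean of 50 with a standard error of 1. A minimal sketch using Python's standard library; the variable names are illustrative.

```python
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, standard deviation 1

x_bar = 50.0  # sample mean from the example
se = 1.0      # standard error from the example

coverage = {}
for k in (1, 2, 3):
    # Probability of landing within k standard errors of the mean.
    coverage[k] = std_normal.cdf(k) - std_normal.cdf(-k)
    low, high = x_bar - k * se, x_bar + k * se
    print(f"within {k} SE: {low:.0f} to {high:.0f}, probability {coverage[k]:.3f}")
```

The three probabilities come out near 0.68, 0.95, and 0.997, which is where the rule gets its name.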
Confidence Interval
\(z = \frac{x - \mu}{\sigma}\)
- x is the raw score, μ is the population mean, and σ is the population standard deviation
The Confidence Interval with the Z-Score is the sample mean \(\pm\) the Margin of Error, which we get from (just for illustration at this point):
\(\text{Margin of Error} = z \times \frac{s}{\sqrt{n}}\)
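As a rough sketch of the z-score and the confidence interval built from it, assuming the usual margin of error (the critical z value times the standard error \(\frac{s}{\sqrt{n}}\)). The function names and the numbers in the usage lines are hypothetical.

```python
from statistics import NormalDist


def z_score(x, mu, sigma):
    """Number of standard deviations that x lies from mu."""
    return (x - mu) / sigma


def z_confidence_interval(x_bar, s, n, confidence=0.95):
    """Sample mean plus or minus the margin of error z * s / sqrt(n)."""
    # Critical z value that leaves (1 - confidence) / 2 in each tail.
    z_critical = NormalDist().inv_cdf(0.5 + confidence / 2)
    margin = z_critical * s / n ** 0.5
    return x_bar - margin, x_bar + margin


# Hypothetical numbers: s = 6 and n = 36 give a standard error of 1.
print(z_score(52.0, 50.0, 1.0))
low, high = z_confidence_interval(x_bar=50.0, s=6.0, n=36)
print(f"95% confidence interval: {low:.2f} to {high:.2f}")
```

With a standard error of 1, the 95% interval is about 1.96 on either side of the sample mean, which is the exact version of the "two standard errors" shortcut.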
pairwise comparison: what are the pairs?
- one sample: comparing one group against a standard value
- two-sample or independent t-test: compares two groups from different populations
- paired t-test: compares a single group, as in a before and after comparison
One or two tails
- Two tailed test: tells if they are different, either greater or less
- One tailed test: tells if one group is specifically greater or less, but not either
Other points:
- degrees of freedom = n - 1
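- When t-test degrees of freedom > 30, the t-distribution converges on the normal distribution used for the z-score
- t-test is more conservative than z-score: its critical values are larger at the same significance level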
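The points about degrees of freedom can be illustrated by comparing critical values. In this sketch the t values are typed in by hand from a standard two-tailed t-table at the 5% level, since Python's standard library has no t-distribution; the dictionary name is made up for illustration.

```python
from statistics import NormalDist

# Two-tailed 5% critical values copied from a standard t-table,
# keyed by degrees of freedom (n - 1).
T_CRITICAL_05 = {4: 2.776, 9: 2.262, 14: 2.145, 29: 2.045, 120: 1.980}

z_critical = NormalDist().inv_cdf(0.975)  # about 1.96

for df, t_critical in T_CRITICAL_05.items():
    n = df + 1
    gap = t_critical - z_critical
    print(f"n = {n:>3}  d.f. = {df:>3}  t = {t_critical:.3f}  "
          f"z = {z_critical:.3f}  gap = {gap:.3f}")
```

The t critical value is always the larger of the two, so the t-test demands more evidence from small samples, and the gap shrinks as the degrees of freedom grow.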
t vs z dist
Continuous variables
normal distribution
- Central Limit Theorem can get us to normal distribution
known population standard deviation
- "known" ~ accepted estimate of the population standard deviation from the Law of Large Numbers (LLN) and CLT
Use if: the population standard deviation is known or reliably estimated and sample size > 30
Group 1: (12.2, 14.6, 13.4, 11.2, 12.7, 10.4, 15.8, 13.9, 9.5, 14.2)
Group 2: (13.5, 15.2, 13.6, 12.8, 13.7, 11.3, 16.5, 13.4, 8.7, 14.6)
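The slides do not say which kind of t-test these two groups are meant for, so the sketch below computes both an independent (pooled-variance) and a paired t statistic by hand. Treating the values as paired by position is an assumption, and the function names are illustrative. The critical values are copied from a standard two-tailed t-table at the 5% level.

```python
import math
import statistics

group_1 = [12.2, 14.6, 13.4, 11.2, 12.7, 10.4, 15.8, 13.9, 9.5, 14.2]
group_2 = [13.5, 15.2, 13.6, 12.8, 13.7, 11.3, 16.5, 13.4, 8.7, 14.6]


def independent_t(a, b):
    """Pooled-variance two-sample t statistic and its degrees of freedom."""
    n_a, n_b = len(a), len(b)
    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * statistics.variance(a)
                  + (n_b - 1) * statistics.variance(b)) / df
    se = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (statistics.fmean(a) - statistics.fmean(b)) / se, df


def paired_t(a, b):
    """Paired t statistic on the element-by-element differences."""
    diffs = [x - y for x, y in zip(a, b)]
    n = len(diffs)
    se = statistics.stdev(diffs) / math.sqrt(n)
    return statistics.fmean(diffs) / se, n - 1


# Two-tailed 5% critical values from a standard t-table.
T_CRITICAL_05 = {9: 2.262, 18: 2.101}

for label, (t_stat, df) in (("independent", independent_t(group_1, group_2)),
                            ("paired", paired_t(group_1, group_2))):
    beyond = abs(t_stat) > T_CRITICAL_05[df]
    print(f"{label:<12} t = {t_stat:.3f}  d.f. = {df}  "
          f"critical = {T_CRITICAL_05[df]}  beyond critical: {beyond}")
```

The two designs use different standard errors and different degrees of freedom, so the same numbers can lead to different conclusions. Deciding which design matches how the data were collected comes before running the test.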
More on reading t-tables plus 1- and 2- tailed tables here:
https://www.statisticshowto.com/tables/t-distribution-table/
Frequency distribution
Used in hypothesis testing
Categorical variables
Ideal sample size 50-1000
+ Less than 10 very unreliable
Two hypothesis tests
- Test of goodness of fit (1 variable)
- **Test of independence (2 variables)**
Categorical variables
+ Testing if categories are related
+ Men/women, Republicans/Democrats, Northerners/Southerners, Black/White
2nd variable is the thing we look for a difference in
+ Pay, attitudes toward taxes, attitudes toward military service, victimization by police
Visualization of Chi-Square
Chi Square Distribution
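To make the test of independence concrete, here is a minimal sketch on a 2x2 table. The counts are invented purely for illustration and do not come from the lecture or any dataset. The critical value is the standard table entry for 1 degree of freedom at the 5% level, and no continuity correction is applied.

```python
# Hypothetical counts, invented for illustration only:
# rows are two groups, columns are two answers to a survey question.
observed = [
    [30, 20],
    [20, 30],
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
total = sum(row_totals)

# Expected count in each cell if the two variables were unrelated:
# (row total * column total) / grand total.
expected = [[r * c / total for c in col_totals] for r in row_totals]

# Chi-square statistic: sum of (observed - expected)^2 / expected over all cells.
chi_square = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)

# Degrees of freedom for a table: (rows - 1) * (columns - 1).
df = (len(observed) - 1) * (len(observed[0]) - 1)

CRITICAL_05_DF1 = 3.841  # chi-square table value, d.f. = 1, 5% level

print(f"chi-square = {chi_square:.3f}, d.f. = {df}")
print("beyond critical value:", chi_square > CRITICAL_05_DF1)
```

With these made-up counts every expected cell is 25, so each cell contributes 1 to the statistic. A statistic above the critical value suggests the two variables are related rather than independent.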
2x2 table
- 2 rows, 2 columns
- 2 variables
- 4 cells
Author: Tom Hanna
Website: tomhanna.me
License: This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
Chi-square probability distribution - By Geek3 - Own work, CC BY 3.0, https://commons.wikimedia.org/w/index.php?curid=9884213
t-test probability distribution - https://simon.cs.vt.edu/SoSci/converted/T-Dist/
POLS3312, Spring 2024, Instructor: Tom Hanna